Search CORE

19 research outputs found

Depth CNNs for RGB-D scene recognition: learning from scratch better than transferring from RGB-CNNs

Author: Herranz Luis
Jiang Shuqiang
Song Xinhang
Publication date: 12/02/2017

Scene recognition with RGB images has been extensively studied and has reached remarkable recognition levels, thanks to convolutional neural networks (CNNs) and large scene datasets. In contrast, current RGB-D scene data is much more limited, so RGB-D methods often leverage large RGB datasets by transferring pretrained RGB CNN models and fine-tuning them on the target RGB-D dataset. However, we show that this approach hardly reaches the bottom layers, which are key to learning modality-specific features. Instead, we focus on the bottom layers and propose an alternative strategy for learning depth features that combines local weakly supervised training from patches with subsequent global fine-tuning on images. This strategy is capable of learning very discriminative depth-specific features from limited depth images, without resorting to Places-CNN. In addition, we propose a modified CNN architecture to better match the complexity of the model to the amount of data available. For RGB-D scene recognition, depth and RGB features are combined by projecting them into a common space and then learning a multilayer classifier, which is jointly optimized in an end-to-end network. Our framework achieves state-of-the-art accuracy on NYU2 and SUN RGB-D with both depth-only and combined RGB-D data. Comment: AAAI Conference on Artificial Intelligence 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Joint Learning of CNN and LSTM for Image Captioning

Author: Jian Sun
Shuqiang Jiang
Xiangyang Li
Xinhang Song
Xue Li
Yongqing Zhu
Publication date: 02/04/2020

In this paper, we describe our methods for participating in the Natural Language Caption Generation subtask of the ImageCLEF 2016 Scalable Image Annotation task. The model combines an encoding procedure and a decoding procedure, built from a convolutional neural network (CNN) and a Long Short-Term Memory (LSTM) based recurrent neural network. We first train a model on the MSCOCO dataset and then fine-tune it on different target datasets that we collected, to obtain a model better suited to the natural language caption generation task. The parameters of the CNN and the LSTM are learned jointly.

CiteSeerX

Multi-Scale Multi-Feature Context Modeling for Scene Recognition in the Semantic Manifold

Author: Luis Herranz
Shuqiang Jiang
Xinhang Song
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Image Representations With Spatial Object-to-Object Relations for RGB-D Scene Recognition

Author: Bohan Wang
Chengpeng Chen
Gongwei Chen
Shuqiang Jiang
Xinhang Song
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2020

Crossref

Towards Domain-Specific Knowledge Graph Construction for Flight Control Aided Maintenance

Author: Chuanyou Li
Mingzhe Song
Shance Luo
Wei Li
Xinhang Yang
Publication venue: MDPI AG
Publication date: 01/12/2022

Flight control is a key system of modern aircraft. During each flight, pilots use flight control to control the forces of flight as well as the aircraft’s direction and attitude. Whether flight control works properly is closely tied to safety, so daily maintenance is an essential task for airlines. Flight control maintenance relies heavily on expert knowledge. To facilitate knowledge acquisition, aircraft manufacturers and airlines normally provide structured manuals for consultation. In addition, computer-aided maintenance systems are adopted to improve daily maintenance efficiency. However, we find that grass-roots engineers at airlines still inevitably consult unstructured technical manuals from time to time, for example when facing an unusual problem or an unfamiliar type of aircraft. Acquiring useful knowledge from unstructured data is inefficient and inconvenient. To address this problem, we propose a knowledge-graph-based maintenance prototype system as a complementary solution. The knowledge graph we built is dedicated to unstructured manuals on flight control. We first build an ontology to represent key concepts and relation types, and then perform entity-relation extraction using a pipeline paradigm with natural language processing techniques. To fully utilize domain-specific features, we present a hybrid method for entity recognition that consists of dedicated rules and a machine learning model. For relation extraction, we use a two-stage Bi-LSTM (bi-directional long short-term memory network) based method that improves extraction precision by addressing a sample imbalance problem. We conduct comprehensive experiments on real manuals from airlines to study technical feasibility. The average precision of entity recognition reaches 85%, and the average precision of relation extraction reaches 61%. Finally, we design a flight control maintenance prototype system based on the constructed knowledge graph and the graph database Neo4j. The prototype system takes alarm messages expressed in natural language as input and returns maintenance suggestions to support grass-roots engineers.

Directory of Open Access Journals

Towards Domain-Specific Knowledge Graph Construction for Flight Control Aided Maintenance

Author: Chuanyou Li
Mingzhe Song
Shance Luo
Wei Li
Xinhang Yang
Publication venue: MDPI AG
Publication date: 12/12/2022

Multidisciplinary Digital Publishing Institute

Multipath Convolutional-Recursive Neural Networks for Object Recognition

Author: Herranz Luis
Jiang Shuqiang
Li Xiangyang
Shi Zhiping
Song Xinhang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Part 8: Pattern Recognition. International audience. Extracting good representations from images is essential for many computer vision tasks. While progress in deep learning shows the importance of learning hierarchical features, it is also important to learn features through multiple paths. This paper presents Multipath Convolutional-Recursive Neural Networks (M-CRNNs), a novel scheme that aims to learn image features from multiple paths using models based on a combination of convolutional and recursive neural networks (CNNs and RNNs). The CNNs learn low-level features, and the RNNs, whose inputs are the outputs of the CNNs, learn efficient high-level features. The final features of an image are the combination of the features from all the paths. The results show that the features learned by M-CRNNs form a highly discriminative image representation that increases precision in object recognition.

Crossref

Spatio-Temporal Memory Attention for Image Captioning

Author: Boyue Wang
Cheng Xu
Junzhong Ji
Xiaodan Zhang
Xinhang Song
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2020

Crossref